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Abstract — We consider the problem of blind identification and 
equalization of single-input multiple-output (SIMO) nonlinear 
channels. Specifically, the nonlinear model consists of multiple 
single-channel Wiener systems that are excited by a common 
input signal. The proposed approach is based on a well-known 
blind identification technique for linear SIMO systems. By 
transforming the output signals into a reproducing kernel Hilbert 
space (RKHS), a linear identification problem is obtained, which 
we propose to solve through an iterative procedure that alternates 
between canonical correlation analysis (CCA) to estimate the 
linear parts, and kernel canonical correlation (KCCA) to estimate 
the memoryless nonlinearities. The proposed algorithm is able 
to operate on systems with as few as two output channels, on 
relatively small data sets and on colored signals. Simulations 
are included to demonstrate the effectiveness of the proposed 
technique. 

Index Terms — Wiener systems, SIMO nonlinear systems, blind 
identification, kernel canonical correlation analysis. 

I. Introduction 

BLIND identification and equalization have been active 
research topics during the last decades. In digital com- 
munications, blind methods allow channel identification or 
equalization without the need to send known training signals, 
thus saving bandwidth. While a lot of attention has gone to 
the analysis of linear systems, many real-life systems exhibit 
nonlinear characteristics. As a result, the field of nonlinear 
system identification has been studied for many years and still 
remains a very active research area (T\, IS), |[3l, |]4]. 

In contrast to linear systems, which can be identified 
uniquely by their impulse response, there does not exist 
a corresponding canonical representation for all nonlinear 
systems. Hence, different approaches are followed to param- 
eterize different subclasses of nonlinear systems, including 
descriptions such as Volterra [5 1 and polynomial fE\ systems. 
While these techniques allow for adequate representations 
of many nonlinear systems, the number of parameters they 
require becomes excessive for high degrees of nonlinearity 
or high input dimensionality. Therefore, several authors have 
considered approximating the unknown nonlinear systems as 
simplified block-based models, including Wiener systems fl], 
im, which comprise a cascade of a linear filter and a mem- 
oryless nonlinearity; Hammerstein systems ||9|, ifTOI . which 
correspond to the inverse configuration; and combinations 
of both ifTTl . lfT2]| . |[13|. We will focus on Wiener systems. 
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Fig. 1. The block diagram of a SIMO system consisting of two Wiener 
systems. The proposed method aims to identify the entire system and to 
estimate s[n] given only xi[n] and X2[n]. 



which, despite their simplicity, have been used successfully 
in applications including biomedical engineering lfT4l . control 
systems |15|, digital satellite communications lfT6l . digital 
magnetic recording ifTTl . optical fibre communications ifTSll 
and chemical processes lfT9l . 

A considerable number of techniques have been proposed in 
recent years to tackle the problem of supervised identification 
of Wiener systems, both single-input single-output (SISO) 
systems [8], |20|, I21J . [22 J and multiple-input multiple-output 
(MIMO) systems |l23l, lIlD, ||23. Nevertheless, relatively 
little research has been conducted on the blind identification 
problem. For SISO Wiener systems, some blind identification 
techniques have been proposed that make assumptions on the 
input signal statistics (see (22! Part IV]). In particular, in 
1.26] . [27] the input signal is required to be independent and 
identically distributed (i.i.d.) and Gaussian. A less restrictive 
approach was followed by Taleb et al. in 1281 . where the input 
signal is only required to be i.i.d. 

The problem of blind identification of nonlinear single-input 
multiple-output (SIMO) systems has also been addressed, 
although only for the class of Volterra models. The SIMO 
model can be obtained for instance by measuring a single 
source using a sensor array. In |29| it was shown that a 
finite impulse response (FIR) linear filter can perform zero- 
forcing (ZF) equalization of SIMO Volterra systems under 
certain conditions. In f3Q\ a different technique was proposed 
for blind equalization of SIMO Volterra models, based on 
second-order statistics (SOS), that improved several aspects 
of |29|, including computational complexity and robustness. 
Both methods require at least three output channels to operate. 

In this paper we will focus on the blind identification 
and equalization of SIMO Wiener systems, as depicted in 
Fig. [T| We propose a blind technique that requires looser 
restrictions than blind SISO techniques, and that is able to 
operate with two or more output channels. The proposed 
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identification approach is based on a well-known technique 
in blind identification of linear SIMO systems lISTTl . ll32l . 
For SIMO Wiener systems the blind identification problem 
is more challenging, as it includes nonlinearities. By drawing 
on the framework of kernel methods, however, the problem 
can be linearized. Some preliminary results of the proposed 
method were presented in ll33l . We extend these results with 
an identifiability analysis, a general formulation for multiple 
outputs, a formulation that exploits identical nonlinearities in 
each channel and more exhaustive numerical experiments. 

The rest of this paper is organized as follows: Section [ll] 
gives a brief review of blind identification methods for linear 
SIMO systems based on SOS. This scenario is extended to 



SIMO Wiener systems with two output channels in Section III 
and generalized to systems with multiple outputs in Section 
IV Section [V] contains a series of numerical experiments, and 
the main conclusions of this work are presented in Section [Vl| 
Throughout this paper the following notation is used: Scalar 
variables are denoted as lowercase letters, x, and vectors 
as boldface lowercase letters, x, defined as column vectors. 
Matrices are indicated by boldface uppercase letters, such as 
X. Square brackets denote the instance of any variable at time 
n, or the n-th element of a matrix or vector, x[n], and a hat 
denotes an estimate of a variable, x. 

II. Blind Identification of Linear SIMO Systems 

We start by reviewing the basic blind identification problem 
of a linear system with two outputs. The extension to multiple 
outputs is straightforward, as shown in 1131 1 and 1321 . The 
signals used in this paper are real, although the proposed 
methods can be easily extended for complex signals. 

Consider a system that consists of two linear channels hi 
and h2 that share the same zero-mean input signal, s[n], as 
depicted in Fig. |2] Assuming FIR channels, the output of the 
i-th channel can be written as 



Xi\n\ 



L-l 
1=0 



l]s[n — I] = hi[n] * s[n], 



where hi = [ft,i[0], . . . , hi[L — 1]] denotes the impulse re- 
sponse vector of the i-th channel, L is the maximal channel 
length (which is assumed to be known), and hi[n] * s[n] is the 
convolution between h^ and the input signal s[n]. This system 
can be obtained for instance by oversampling a single linear 
channel when the source signal has some excess bandwidth, 
which is the bandwidth occupied by the signal beyond the 
Nyquist frequency l/2Ts (see |34, Section 9.2.1]). 

The identification method presented by Xu et al. in ||3T1 . 
which is closely related to linear prediction, exploits the 
commutativity of the convolution, in particular 

* * s[n]) = hi[n\ * {h2[n\ * s[n\). 

This property inspired the design of the identification diagram 
shown in Fig.|2] which allows to find estimates of the channels, 
hi and h.2, by minimizing the following cost function 
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linear SIMO system Identification diagram 

Fig. 2. A linear SIMO system and the corresponding blind identification 
diagram. If hi = hi and h2 = h2, the output error e[n] will be zero. 



with respect to hi and h2, where 

zi[n] = h2[n] xi[n] = /i2[n] * {hi[n\ *s[n]), 

and Z2 [n] is constructed in a similar fashion. 

In order to solve this minimization problem, we define the 
data matrix 



X, 



XiU 



' Xi[n + L — 1] 
x,[n + N -1] ■■■ Xi[n + N - L] 



i = l,2. 



By denoting the estimate of the channel impulse response 
vectors as 

^k[0],...,k[L-l] 

it can be easily verified that in a noiseless case the solution 
should satisfy 

Xih2=X2hl, (2) 

as illustrated in the identification diagram of Fig. |2] Correct 
identification is guaranteed when the channels h^ do not share 
any common zeros and the linear complexity of the input 
sequence is sufficiently high OTl . For real- world signals this 
is generally satisfied, see [31]. 

In case the outputs Xi[n] are corrupted by additive noise, 
Eq. (|2]) cannot be fulfilled in general, and the optimal filters 
hi and h2 need to be determined by solving an optimization 
problem. In order to avoid the zero-solution h; = 0, either 
the norm of the filters h^ or the norm of the output signal 
Xihj is typically fixed. A restriction on the filter norm was 
used in I.31J to develop a least-squares (LS) method. With this 
restriction, the minimization problem ([T} becomes 



minimize 

hi,h2 



Xih2-X2hi 



subject to 1 1 hi I p 
Its solution is obtained by solving the eigenvalue problem 

X^X2 

-XfX: 



— X^Xi 
Xl^Xi 



= 1. 



h = ph, 



(3) 



in which h = [h!f,hj]^ is found as the eigenvector corre- 
sponding to the smallest eigenvalue. 
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If a constraint is applied on the output signal energy (as in 
II32I ). the following optimization problem is obtained 



nunirmze 

hi,h2 



Xih2 - Xahi 



(4) 



subject to ||Xih2| 



IXshil 



Problem Q is a canonical correlation analysis (CCA) problem, 
whose solution is given by the principal eigenvector of the 
following generalized eigenvalue problem (GEV) (see 1351 ). 



X^Xi' 
X^Xs 



P 



Xj X2 






XfXi 



(5) 



Note that both the LS and CCA-based algorithms require 
knowledge of the maximum channel length L. More generally, 
the following assumptions are required in order to guarantee 
identifiabihty lf3Tl : 

Al. The linear channels are coprime, i.e. they do not share 

any common zeros. 
A2. The linear complexity of the input signal is at least 2L + 

1, where L is the maximum length of the hnear channels. 

Once the channels hi and h.2 have been estimated by 
solving either one of the eigenvector problems ([3]) or (|5]l, 
system equalization can be performed by applying the zero- 
forcing (ZF) or the minimum mean-square error (MMSE) 
algorithm. For the proposed technique we choose to work 
with the constraint based on the output signal energy, and 
its corresponding CCA formulation (|5]l, since it reduces the 
noise enhancement problem, especially in the case of colored 
signals or a small number of observations 



III. 



Blind Identification and Equalization of 
SIMO Wiener Systems 



The problem of interest consists of nonlinear SIMO system 
identification, in which each channel is modeled as a Wiener 
system. This model can be obtained by using a sensor array 
at the receiving end, given that each individual sensor allows 
to be represented as a Wiener system. In accordance to the 
nomenclature used in the literature we call this system a SIMO 
Wiener system. Fig. [T] displays a system with two outputs, as 
encountered for instance in OTI . The output of the i-th channel 
is obtained as 

/L-l \ 



We will restrict the nonlinearities fi{-) to be monotonic 
and invertible in this work, since the proposed identification 
method is based on estimating the inverse nonlinearities. 
This restriction is fulfilled in many practical scenarios, for 
instance when the nonlinearities are modeled as saturating 
nonlinearities (which is the case for saturating amplifiers, limit 
switch devices in mechanical systems and overflow valves 
among others — see examples in [221 ). 

Before describing the details of the proposed method we 
discuss the identifiabihty conditions of this system. 



A. Blind identifiability 

We start by pointing out some ambiguities that need to 
be taken in mind when identifying a SIMO Wiener system. 
Throughout this discussion it is understood that any identifi- 
cation solution is only given up to a set of scalar constants, 
which represent scalings of its unknown internal signals yi [n] 
and its source signal s[n]. Furthermore, the linear channels 
of the SIMO Wiener system should be of length L > 1. 
A system with L = 1 represents a degenerate case, as it 
is impossible to identify its nonlinearities: For instance, any 
monotonic transformation d{-) of its source signal would 
allow to construct a different SIMO Wiener system that has 
different nonlinearities, fi{9^^{-)), while having the same 
output signals. 

An important observation is that the described system is 
not identifiable in general when the input signal is of finite 
length. In order to prove this statement we will make use 
of the concept of amplitude order, based on order statistics: 
Given a sequence of samples, we define the amplitude order as 
the order of these samples when they are sorted ascendingly, 
where samples with identical values are given the same order 
For instance, the amplitude order of the sequence [1, 4, 3, 3] is 
[1,3,2,2]. An interesting property is that if two sequences 
have the same amplitude order there exists a monotonic 
function 6{-) that transforms one sequence into the other one. 

Lemma 1. A SIMO Wiener system with a monotonic invertible 
nonlinearity is not identifiable in general if its input signal s[n\ 
is of finite length. 

Proof: We first show that a SISO Wiener system is not 
identifiable in general for finite N. The proof is given by a 
simple counterexample. Denote by h the linear channel of a 
given Wiener system, by /(•) its nonlinearity and by y[n] its 
intermediate signal. Consider 6 to be the minimal distance 
between any two consecutive ordered samples y[n]. Since the 
input signal s[n] to this system is of finite length, we can 
assume that 6 will be small but non-null. 

Now consider an alternative Wiener system with input signal 
s[ti] — s[n]+e[n], linear channel h+v and intermediate signal 
y[n], where e[n] represents a perturbation signal and 1/ is a 
channel perturbation that is not a scaling of h. It is clear that 
by choosing the perturbations small enough w.r.t. S, but not 
zero, the amplitude order of each y[n] can be made identical to 
the one of its corresponding y[n]. Therefore, y[n] and y[n] can 
be transformed one into the other through a function y[n] — 
6{y[n\). By choosing the nonlinearity of the alternative Wiener 
system to be f{0^^{-)) a Wiener system is obtained that is 
different from the given system but whose output sequence 
is identical. Hence, the given Wiener system is not uniquely 
identifiable. 

The previous counterexample can be applied to each branch 
of a SIMO Wiener system independently. Therefore, if no 
additional assumptions are made, a SIMO Wiener system is 
not uniquely identifiable when its input signal has finite length. 

■ 

While Lemma [T| may seem discouraging, it requires to 
be put in a practical perspective. As the previous example 
shows, the norm of the allowed perturbations depends on the 
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differences S between consecutive ordered samples. If the 
length N of the input signal grows and the nonlinearities 
become completely excited in their ranges, it is reasonable 
to assume that the (5-values will shrink. As a result, the norm 
of the allowed perturbations will shrink as well, and hence 
the identification error reduces. In the limit case of — oo 
the system becomes completely identifiable. Note that the 
reason the system is not identifiable in theory for finite A^ 
is the unrestricted flexibility of the nonlinearities, represented 
by d{-) in the example. If this flexibility is somehow limited, 
though, identifiability becomes possible. In a practical scenario 
this is generally true, as can be motivated by the principle 
of parsimony. Therefore, as long as the nonlinearities are 
sufficiently smooth, it is possible to identify a SIMO Wiener 
system using only a finite number of samples. We will model 
the nonlinearities as non-parametric kernel expansions (see 
Section |III-C| l, which allow to impose different degrees of 
smoothness on the nonlinearities without limiting their shape 
to any particular model. 

Based on the previous discussion we can formulate a set 
of assumptions, in addition to Al and A2, that guarantee 
identifiability in most practical situations: 
A3. L > 1; 

A4. The nonlinearities are invertible and monotonic; 
A5. A^ ^ 1 and each /j(-) is sufficiently excited in its range. 
Appropriate values of A^ depend on each scenario individually. 
Specifically, the smoother the nonlinearities of the system, the 
lower A^ can be. As we will see in Section [Vj relatively small 
sample sizes are sufficient in practice. 

B. Outline of the proposed method 

We now describe the proposed blind identification method, 
starting with the two-channel Wiener system. The proposed 
identification diagram, which has the structure of a multiple- 
input single-output (MISO) Hammerstein system, is pictured 
in Fig. |3] In particular, since the nonlinearities /i(-) of the 
SIMO Wiener system are assumed to be invertible, they 
can be canceled out by applying the inverse nonlinearities 
= /j^^(-) to the system outputs Xi[n\. If the nonlinearities 
were known, the problem would reduce to identifying the 
linear channels hi and h.2, which is achieved by applying 
either one of the discussed linear techniques. However, since 
the nonlinearities fi{-) are also unknown, they need to be 
estimated jointly with the linear part. 

Similarly to the linear scenario, we define the cost function 

N N 

J = i^Y^V^M- ^2[n]\^ = ^Y.\^H\\ (6) 

n—l n—1 

which uses the identification diagram outputs, defined as 

L-l 

Zi[n] = Y,h2[l]9i{xi[n^l]), (7) 

and equivalent for Z2[n]. The minimization of Eq. (|6]l rep- 
resents a nonlinear optimization problem, which is generally 
hard to solve. In order to avoid trivial solutions such as the 
zero-solution and overfit solutions caused by an excessive 



flexibility of the nonlinearities, the solutions will require to be 
restricted in several ways. We will resort to the framework of 
kernel methods to implement these restrictions and to linearize 
the problem. 

C. Kernel methods 

Kernel methods are powerful nonlinear techniques based 
on a nonlinear transformation of the data x into a high- 
dimensional reproducing kernel Hilbert space (RKHS), in 
which it is more likely that the transformed data $(x) is lin- 
early separable. In this feature space, inner products can be cal- 
culated by using a positive definite kernel function satisfying 
Mercer's condition |38, Chapter 5]: k(x, x') = (<l>(x), $(x')). 
This simple and elegant idea, also known as the "kernel trick", 
allows us to perform inner-product based algorithms implicitly 
in feature space by replacing all inner products by kernels. 
The solution of the resulting linear problem in feature space 
then corresponds to the solution of the nonlinear problem in 
the original space. Common kernel-based algorithms include 
support vector machines (SVM) ||38l Chapter 5] and kernel 
principal component analysis (KPCA) ||39l . 

Thanks to the Representer theorem |l40l, a large class of 
optimization problems in RKHS have solutions that can be 
expressed as kernel expansions in terms of the available data. 
Specifically, it allows us to model a nonlinearity g(-) as 

N 

2/ = 5W = ^ aHK(x,x[n]). (8) 

n=l 

where {x[n]|n = 1, . . . , A^} are the training data. It has been 
shown that this expansion acts as a universal approximator 
iBTI for sufficiently rich kernels such as the Gaussian kernel, 

k(x,x') =exp(-||x-x'||V2cr2)- 

For a given set of A^ input-output data pairs (x[?i], ?/[«]), 
Eq. (|8]l can be written in matrix form as 

y - Ka, (9) 

where y = [y[l], . . . ,y[N]f , a = [a[l], . . . ,a[N]f , and 
K e M^^^ is the kernel matrix with elements 

=«;(xW,x[j]). (10) 

As we will see in the sequel, smoothness constraints can 
be imposed on the represented nonlinearity by restricting the 
norm of a. First, though, we outline the proposed optimization 
problem using kernel expansions to represent the estimated 
nonlinearities. 

D. Proposed optimization problem 

Consider the output zi [n] of the first branch of the proposed 
identification scheme of Fig. [3] By introducing the kernel 
expansion (|8| into Eq. (|7|, it can be written as 

L-l N 

^1 W — h2[l]Ki[n — I, m]ai[m], (11) 

/=o m=l 
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SIMO Wiener system Identification diagram 

Fig. 3. A SIMO system consisting of two Wiener subsystems, followed by the proposed identification diagram in the form of a MISO Hammerstein system. 



where Ki is the kernel matrix of the available data 
xi[n] of this branch. The entire output vector zi = 
[zi[l], zi[2], . . . , 2:i[Af]]-'" can thus be written as 

zi = Kir2, 

in which the elements of Ki G are defined as 

Ki [n, IM + m] = Ki [n — /, m] , 

and Y2 represents the Kronecker product of h.2 — 

[h2[0\, . . . Ml - W and ai = [ai[l], . . . , ai[7V]]^, 



r2 = h2 



Oil . 



After obtaining a similar expression for the second output 
channel, i.e. Z2 = K2ri, the linear optimization problem Q 
is extended for SIMO Wiener systems as 



minimize ||Kir2 — K2ri|| 

ri,r2,hi,h2,ai,a2 

subject to ||Kir2|p = ||K2ri|P = 1 

r2 = h2 ® ai 
ri = hi ® a.2. 

For simplicity, we denote this problem as 

2 



(12) 



minimize 

hi,h2,Q;i,Q;2 

subject to 



Zl - Z2 



|Z1|P = ||Z2|P = 1, 



(13) 



where we have omitted the trivial dependency of zi and Z2 
on hi, h2, OLi and ai, see Eq. ( [TT| i. 



E. Alternating optimization procedure 

The optimization problem ( [T3| ) is not convex and generally 
hard to solve. However, if a.i and 0.2 were available, this 
problem would reduce to the easier problem Q. Equivalently, 
if hi and h2 were known, a similar reduction would lead to 
another optimization problem of the form of Q that would 
yield solutions for a.i and 0.2- This suggests an iterative 
scheme that alternates between updating the estimates of 
the linear channels h, and the nonlinearity estimates a^. 
Convergence is guaranteed because each update may either 
decrease or maintain the cost [42 J, [?]. 



1) Iteration 1: given a,, obtain h^.- If estimates of ai and 
a.2 are given, the output zi[n] is 



L-l 



(14) 



in which the elements yi[n] are obtained with Eq. ([8]). In 
matrix form, Eq. ( [T4] i becomes 

Zl = Yihs, 

where the n-th row of the matrix Yi contains the elements 
from yi[n] until yi[n + L — 1]. This allows us to rewrite the 
minimization problem ( [T3| ) as 

1^ 

(15) 

subject to ||Yih2|p = ||Y2hi|p = 1. 

This problem is equivalent to the CCA problem Q, whose 
solution is found by solving the GEV (|5]l |43|. 

2) Iteration 2: given h^, obtain a^.- If estimates of hi and 
h2 are given, we obtain 



minimize 

hi ,h2 



|Yih2-Y2hi| 



zi[n\ 



N 

E 

m — 1 



Wi [n, m]ai [m] 



where the auxiliary variable 



L-l 



Wi[n,m] = V h2[l]Ki[n-l,m] 



1=0 



(16) 



(17) 



is introduced. In matrix form, Eq. ( [T6] l can be written as 
Zl = Wifii, 

with Wi e R^^^ . By doing so, the minimization problem 
( [T3] ) becomes 



minimize ||Wiq:i — W2Q;2 

CKl ,a2 

2 



subject to 1 1 Wi 6:1 1 



(18) 



W2Q:2| 



which establishes a kernel CCA problem that accounts for the 
estimation of the nonlinearities gi{-). The solution is found by 
solving the associated GEV, which is similar to (|5]l. 

F. Computational issues and regularization 

We now discuss some of the computational issues that need 
to be solved to guarantee that the proposed procedure performs 
correctly and efficiently. 
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1} Low-rank approximations: Solving a GEV generally 
requires cubic time and memory complexity in terms of the 
involved matrix sizes, i.e. 0{N^). Accordingly, if the training 
set is large, the GEV for Eq. ( [T8] l will pose very large 
computational requirements. Interestingly however, kernel ma- 
trices usually have a quickly decaying spectrum ll44l . B31 . 
which allows to approximate them reliably by a low-rank 
decomposition of the form 



(19) 



where G e M'^^*^ and M < N. Following Eq. Q, we obtain 

ji = GiO.,, (20) 

where a.i now contains a reduced set of M expansion coeffi- 
cients. 

The auxiliary variables defined in Eq. ( fTT) ! are replaced by 



L-l 
1=0 



(21) 



The new matrices have dimensions N x M, which reduces 
the complexity of the GEV for Eq. ^ to 0{M^). Several 
methods have been proposed to retrieve suitable kernel matrix 
decompositions in O^NlvP) time, most notably Nystrom 
approximation B4l . sparse greedy approximations ||451 . and 
incomplete Cholesky decomposition (ICD) ll46l . We will use 
the latter in the simulations. 

2) Data centering: An important requirement of CCA is 
that the input data be centered. For KCCA, this translates into 
the need to center the data in feature space |46|. While it is 
hard to remove the mean explicitly from the transformed data 
$(a;[n]), the approximate kernel matrix GG^ can be centered 
easily in feature space by performing the transformation 



G^II-lllG 



(22) 



where I is the unit matrix and 1 is an iV x all-ones matrix. 
This operation simply removes the column means from G, 
and can thus be implemented without explicitly calculating 
any N x N matrices ||4T). 

3) Regularization: If any of the matrices is invertible, 
the GEV ( [T8] l does not yield a useful solution as it allows 
to find perfect correlation between any two data sets. This 
is a standard issue in KCCA that stems from the unbounded 
flexibility of the nonlinearities, which is a property we seek to 



avoid (see Section IIl-A i. A straightforward fix is to regularize 
the flexibility of the projections a.i by penalizing their norms, 
as follows ifel. Il43l: 



minimize || WiCti — W2Q;2|1 

0-1,0.2 

subject to ||Wiq;i||^ + c||ai||^ = 1 
|lW2a2|P + c||a2|P = l, 



(23) 



where c is a small regularization factor. This yields the 
following GEV, which combines low-rank approximation and 



Algorithm 1 Alternating KCCA (AKCCA) for Blind Equal- 
ization of SIMO Wiener Systems 

input: Output data sets Xiln] of the Wiener system. 

Obtain the decomposed kernel matrices Gi, see Eq. ([T9]l. 

Center G^ with Eq. p2| ). 

initialize: Set yi[n] — Xi[n] and construct Y^. 

repeat 

CCA: With given Yj, update hj by solving Eq. ( [TS] ). 
With new h^, update as in Eq. ( |2T] i. 
KCCA: With given W^, update dci by solving Eq. ( |23] l. 
With new a-i, update as in Eq. ( |20] i and construct Y^. 
until convergence 

Apply linear ZF or MMSE equalization on yJ"] ^i- 
output: s[n] 



regularization 



VV^fW2 



W^Wi 









Oil 





Wf Wi + cl 






P 


W^W2 + cl 




a.2 



(24) 



G. Initialization and algorithm overview 

Analogously to many other iterative techniques, the pro- 
posed cyclic minimization algorithm could suffer from local 
minima. In practice, local minima can be avoided by means 
of a proper initialization technique. A straightforward initial- 
ization consists in estimating the initial nonlinearities as the 
identity function gi{x) ~ x, and obtaining the initial estimate 
of the linear channels by solving the linear CCA problem 
Q for the system outputs Xi [n] . 

In case a more accurate initialization is required, the opti- 
mization problem ( [T2] i can be solved directly with respect to 
ri and V2, after making the necessary modifications to take 
into account regularization and low-rank decompositions. The 
Kronecker structure can be forced a-posteriori onto the esti- 
mates of ri and r2 by applying singular value decomposition 
(SVD) on them (specifically, on the M x L matrices that are 
obtained by ordering their elements column-wise). 

The entire alternating technique for two output channels 
is summarized in Alg. [T] We denote this technique as al- 
ternating kernel canonical correlation analysis (AKCCA). 
Assuming L < M, the computational complexity of a single 
iteration of this algorithm is dominated by the KCCA prob- 
lem. In particular, constructing its matrices and solving the 
GEV ^ require 0{NM) and 0{AP) time, respectively. 
If more than two output channels are present, the proposed 
algorithm follows exactly the same course, although the used 
formulae require to be extended, as shown in the sequel. 
A Matlab implementation of AKCCA can be obtained at 
http://gtas.unican.es/people/steven 

IV. Extensions 
A. Algorithm for systems with multiple outputs 

The proposed algorithm for systems with 2 outputs can be 
extended to systems with an arbitrary number of outputs, say 
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h2 
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'^io L' "J 








ha 


— 2:23 w 

— Z2i[n] 


hi 






hi 


2^31 W 

— Z32 [n] 


ha 



0- 



( 















SIMO Wiener system Identification diagram 

Fig. 4. A SIMO system consisting of three Wiener subsystems, and the proposed identification diagram. 



ei2[«] 
631 M 

623 [n] 



P, in a straightforward fashion. The problem ( [T3] l is extended 
from 2 to P outputs as 



minimize 



subject to 



P 

E 



P 



(25) 



where z^- — . . . , ^^^[iV]]-^ contains the signal Zij\ri\ 

obtained by transforming the output signal Xi\n\ by gAA and 
filtering it by hj. The identification diagram of Fig. 4 illus- 
trates this optimization diagram for the case of P = 3. While 
the energy restriction of problem ( p5| ) is slightly different 
compared to the restriction of the problem ( pjj ) for 2 outputs, it 
was shown in ||32]| that they are equivalent for these problems. 

The optimization problem ( |25| ) can be solved by extending 
the iterative technique of Alg. [T]to multiple channels. To this 
end, the problems ( [T5| ) and (|23j in Alg.[T]require to be replaced 
by their multi-channel equivalents: 

1 ) Iteration 1: Given estimates of a^, a set of new estimates 
of h.i is found by solving 



(26) 



minimize || Y,;hj — Yjhijp 
hi, ....hp 

P 

subject to ||Y,;hj||^ = 1, 

where the elements of the matrices are obtained through 
( |20| i. The solution of the minimization problem ( |26] l can be 
found as the principal eigenvector of the GEV 



in which 



Rn 



Rah — pDcjh, 

YjYi .. 
Yf Y2 

Yf Yp Yl'Yp • ■ 



Y?Yi 
Y?Y2 



(27) 



(28) 



Dq; is a block-diagonal matrix whose «-th block on the 
diagonal is Ef=iy5^» YjY^, i.e. 



■y-P Y^Y- 



(29) 



••• e;=yyjy, 

and h contains the different estimated filters h — 

[hf,hi^,...,h?F. 

2) Iteration 2: Subsequently, the estimates of h,; are fixed 
and new estimates of a.i are obtained by solving 

p 

minimize \^ 1 1 j a. i — W j a j 1 1 ^ 

ai,...,Q;p .^—^ 

p (30) 
subject to ^ llWya.lp + IjaJ^ = 1 

i,j=l i 

where the auxiliary variables Wij are defined as 

L-l 

Wij [n, m] = hj [l]Gi [n — /, to] . 

Again, the minimization problem ([30| can be solved by 
retrieving the principal eigenvector of the corresponding GEV, 
which is found as 

in which 

WfaWsi ••• 

W|;Wi2 



(31) 



Rc 



WfpWpi- 

W^pWp2 



, (32) 



is a regularized block-diagonal matrix whose i-th block 
on the diagonal is I]f=i;j^i W^-W^ , i.e. 



■E.f.2wr,w 



Ij vv ij ■■■ 



Ef=/w?^.wp,, 



cl 

(33) 
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TABLE I 

Impulse responses of the linear channels used in the 
simulations. 



i 


h,[0] 


h^ll] 


hi[2] 


h,[3] 




1 


0.4115 


0.4165 


0.2249 


-0.0233 


-2.1971 


2 


-0.5734 


0.1021 


-0.1259 


-0.4176 


0.6657 


3 


1.4255 


0.6457 


-0.9509 


-0.1657 


-0.2512 


4 


0.2846 


-0.3880 


0.5373 


0.7983 


0.4093 


5 


-0.8769 


-0.3056 


-0.1160 


0.8130 


-0.8007 



and a = [a^ ,a.2,--, CKpJ ■ 

B. Algorithm for systems with identical nonlinearities 

In some cases it is known a priori that the P nonhnearities 
) are identical, for instance if the SIMO Wiener system is 
obtained by oversampHng a SISO Wiener system. The vaHdity 
of the oversampled model follows from the fact that the SISO 
system's nonlinearity is memoryless, and thus it applies to the 
signal j/[ri] on a sample-by-sample basis. Therefore, it does 
not matter if one oversamples the internal signal y[n] (similar 
to the linear case of Section |ll]i or the output signal x[rL\. 

The knowledge that the nonlinearities are identical can be 
exploited to obtain a more accurate estimate. Specifically, the 
data y — [yf , . . . ,y'^Y '^^^ estimated jointly as 

y = Ga, 

where G — [Gf , . . . , Gp]^ is obtained by decomposing the 
kernel matrix of all data [xi[l],xi[2], . . . ,Xp[-/V]]"^ and the 
vector a. € M*^^^ contains the expansion coefficients. The 
matrices Q and Dj^ Q in the GEV problem (|3T]) 
reduce to the M x M matrices 

p 

and 

p 

We denote this extension of the algorithm as AKCCA-I. 

V. Experiments 

We now demonstrate the performance of the proposed 
algorithm through a number of computer simulations. Several 
different SIMO Wiener systems are used throughout these 
experiments. Their linear channels are taken from Table [l] in 
which the impulse responses are chosen randomly, hi[i] e 
A/^(0, 1). Their nonlinearities are chosen from the following 
mono tonic invertible functions: 

1) fi{y) = tanh(0.82/) + O.ly, a smooth saturation; 

2) f2{y) — — 0.1sin(3y) — 0.33?;, a "stairway" function; 

3) hiv) = l-5y - 2.5 |^"^P[l^] , a smooth deadzone. 
The inverse functions of these nonhnearities can be observed 
in Fig. |5]of the first experiment. 



TABLE II 

Estimated impulse responses in experiment 1 for N = 256. Top: 
AKCCA (blind). Bottom: supervised KCCA from (21). 



i 


/i,[0] 


h^[l\ 


h,[2\ 


/ii[3] 




1 


0.4145 


0.4171 


0.2285 


-0.0235 


-2.1961 


2 


-0.5740 


0.1034 


-0.1303 


-0.4153 


0.6655 


3 


1.4271 


0.6466 


-0.9483 


-0.1655 


-0.2502 


1 


0.4115 


0.4165 


0.2249 


-0.0232 


-2.1971 


2 


-0.5742 


0.1019 


-0.1271 


-0.4151 


0.6663 


3 


1.4256 


0.6458 


-0.9508 


-0.1657 


-0.2512 



The parameters of the AKCCA algorithm are set as follows: 
A Gaussian kernel is used with a different kernel width ai for 
each channel. The kernel widths are chosen using Silverman's 
rule 148, Section 3.4.2], 

in which N is the number of data points and A = min(c?, (53 — 
(ji)/1.34) is the minimum of the empirical standard deviation 
d and the data interquartile range scaled by 1.34. The kernel 
matrix decompositions from Eq. ( [T9] l are obtained by applying 
ICD 1*461 on the available data Xi[ri]. The precision of ICD 
is chosen as 10^*, resulting in values of M within the range 
10 < ill < 50 for all experiments. A standard regularization 
coefficient of c = 10^^ is fixed. Convergence of the AKCCA 
algorithm is assumed when the change in cost between two 
iterations is less than 10^^". 

A. Experiment 1: System identification 

In the first experiment we study the influence of the number 
of data, N, on the identification performance of the proposed 
algorithm. We also compare some results to a related super- 
vised method. 

An i.i.d. Gaussian signal is used as the input to a 1 x 3 
Wiener SIMO system, with linear channels hi, h2 and ha 
(from Table [l| and nonlinearities /i, /2 and f-^. No noise is 
assumed in first test. We perform system identification with 
AKCCA for input signals of three different sizes, N — 16, 
TV = 64 and iV = 256. 

The results of AKCCA are shown in Fig. |5] Each column 
of plots shows the three estimated inverse nonlinearities gi 
corresponding to one value for N, and the last column shows 
the estimated impulse responses of the linear channels for 
N = 256. In order to account for the unknown scaling factor 
that is inherent to Wiener system identification, all estimates 
were scaled to obtain the same norm as their true values. 
While perfect system identification is only possible for source 
signals of infinite length, it is clear that by forcing smoothness 
onto the solution through a small amount of regularization 
the number of data to reach an acceptable solution is fairly 
low: Reasonable estimates are obtained for > 64 in this 
experiment. 

Table |ll] displays the impulse responses for N — 256 as 
estimated by AKCCA. As a benchmark, we include the esti- 
mates obtained by the supervised KCCA-based identification 
algorithm method from Ell in the lower part of this table. This 
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1 
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Fig. 5. Estimated inverse nonlinearities and linear cliannels for a 1 X 3 Wiener system witli different numbers of data A^. Each row sliows the estimates for 
one branch of the system; the solid line in the first three plots represents the true nonlinearity, and the dots indicate the true x versus the estimated y-values. 



algorithm is performed in batch mode, on each subchannel 
individually. As can be observed, the performance of the 
proposed blind algorithm is fairly close to the performance 
of this related supervised technique. 

In order to study the convergence of the proposed AKCCA 
algorithm we plot its equalization MSE versus the number 
of iterations in Fig. [6] Equalization is carried out here by 
performing zero-forcing on the system identification result 
previously obtained. In addition to the noiseless scenario we 
also include results for the case of 20 dB SNR. As can be 
observed, the algorithm typically converges in few iterations. 
For the noiseless case, convergence times on a 3GHz 64-bit 
Intel Core 2 PC with 4 GB RAM running Matlab R2009b 
totaled respectively 0.34s, 0.46s, 0.72s, 1.67s and 3.05s, for 
N ranging from 64 to 1024, as in Fig. [6] 

B. Experiment 2: Comparison of equalization performance 

In the second experiment we examine the performance of 
the proposed algorithm on the problem of blind equalization 
of SIMO Wiener systems. The length of the source signal 
is fixed as = 256. The chosen SIMO Wiener system 
consists of the channels hi, h.2 and hs and an identical 
nonlinearity, /i, in each branch. Since all SIMO branches 
share the same nonlinearity, we can compare the performance 
of the standard AKCCA algorithm to the performance of 
AKCCA-I, which exploits this property. Different amounts 
of additive Gaussian white noise are added to the output of 
the system. In order to perform equalization, zero-forcing is 




CO 

M 



0123456789 10 
iterations 




■N = 64 
■N= 128 
N = 256 
-N = 5I2 
■ N = 1024 



0123456789 10 
iterations 



(a) no noise 



(b) 20 dB SNR 



Fig. 6. Convergence of equalization MSE of AKCCA in experiment 1. 

performed after convergence is reached (see Algorithm [TJ. 
We compare the equaUzation performance of the following 
algorithms: 

1) CCA on linear SIMO system: As a benchmark, we 
apply the blind linear CCA-based equalizer with zero- 
forcing from ||32J on a system that only contains the 
linear channels hi, \\2 and ha. 

2) CCA on SIMO Wiener system: The same blind linear 
method is applied to the chosen SIMO Wiener system. 

3) AKCCA on SIMO Wiener system: The proposed 
algorithm, applied to the SIMO Wiener system. 
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Fig. 7. Equalization MSE of CCA and KCCA-based algorithms on linear 
SIMO systems and SIMO Wiener systems. 



4) AKCCA-I on SIMO Wiener system: The extension of 
the proposed algorithm that takes into account that the 
nonlinearities are identical (see Section [IV-B| i. 
Fig. [7] shows the equalization MSE, calculated between the 
true input signal and the estimated input signal. Averages 
are taken over 100 independent Monte-Carlo simulations. The 
results indicate that the proposed algorithms AKCCA and 
AKCCA-I show good overall performance, and AKCCA-I 
obtains an advantage over AKCCA at high SNR values starting 
at 40 dB. 

C. Experiment 3: Influence of input characteristics and num- 
ber of output channels 

In the last experiment we study the performance of the pro- 
posed AKCCA algorithm for different system configurations 
and input signal characteristics. 

We perform three tests with different input signal charac- 
teristics. In each test we compare the performance on three 
systems with different numbers of output channels. These 
systems have 2, 3 and 4 output channels, respectively, whose 
linear channels are taken in order from Table |l] and whose 
nonlinearities are all chosen as /i. The length of the source 
signal is fixed as = 256. 

The availability of extra channels allows to exploit addi- 
tional spatial diversity, and thus it is expected that a better 
result will be obtained. Nevertheless, when additional channels 
are available, the number of parameters to estimate is also 
higher, which raises the computational cost and could affect 
the results if only few data were available. 

1) Gaussian i.i.d. input signal: First, we perform blind 
equalization with AKCCA on the three systems using a 
Gaussian i.i.d. input signal as in the previous experiments, 
i.e. s[n] e A/'(0, 1). The results for the final MSE after equal- 
ization are shown in Fig. |8] As can be observed, performance 
improves when channels are added, although the improvement 
per extra channel is smaller as more channels are added. In 



Fig. 8. Equalization MSE of AKCCA with a Gaussian i.i.d. input signal. 
TABLE 111 

Average execution times on a 3GHz 64-bit Intel Core 2 pc with 
4 GB RAM RUNNING Matlab R2009B. 





dB SNR 


30 dB SNR 


60 dB SNR 


2 channels 


0.52s 


0.14s 


0.11s 


3 channels 


1.25s 


0.40s 


0.27s 


4 channels 


2.89s 


0.84s 


0.63s 



Table III the average execution times are displayed, for the 
three systems and for different amounts of SNR. As expected, 
the algorithm requires more iterations to converge in noisy 



scenarios. Note also that the computational complexity scales 
cubically with the amount of subchannels of the system, see 
Section HITgI 

2) Colored input signal: In a second test we study the 
performance of AKCCA when the input signal is colored. 
In order to color the source signal s[n] before it enters the 
SIMO Wiener system, we apply a 20-tap low-pass FIR filter 
he onto it, i.e. s'[n] — he* s[n], with cut-off frequency at 
O.Ttt radians per sample and a stopband attenuation of 60dB. 
AKCCA is then performed to retrieve the filtered input signal 
s'[n]. As the linear channels are now hardly excited in the 
stopband frequency range, it is much harder or even impossible 
to identify their complete frequency responses. Nevertheless, it 
may still be possible to estimate the colored input signal. The 
equalization MSE of AKCCA is shown in Fig.|9] Interestingly, 
the results are only slightly affected w.rt. to the previous test 
(see Fig. [8]), which demonstrates that the proposed method is 
also suitable for colored source signals, up to some extent. 

3) Binary input signal: Finally, the test is repeated on a 
system with a binary input, s[rt] e { — 1,+!}. The obtained 
BER values are shown in Fig. [TO] We include the results for 
a system that uses a fifth channel as an additional subchannel, 
hs in this case. As can be observed, the performance of the 
two-channel system is substantially improved by adding a third 
subchannel. By including additional subchannels, furthermore, 
the performance keeps improving, though slightly less for each 
channel added. 

VI. Conclusions 

We have considered the problems of blind identification and 
blind equalization of SIMO Wiener systems. These systems 
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Fig. 9. Equalization MSE of AKCCA on SIMO Wiener systems with a 
colored input signal. 
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Fig. 10. Equalization BER of AKCCA on SIMO Wiener systems with a 
binary i.i.d. input signal. 

are multi-channel Wiener systems that are excited by a single 
shared source signal. By applying a kernel transformation to 
the output data, we have obtained a linearized optimization 
problem for which we have proposed an iterative algorithm. 
The proposed KCCA algorithm iterates between a CCA algo- 
rithm to estimate the linear channels and a KCCA algorithm 
to estimate the memoryless nonlinearities. While we have 
indicated that blind identification of SIMO Wiener systems 
is not possible in general for finite source signals, we have 
also shown that identifiability becomes possible in practice by 
posing some smoothness constraints on the nonlinearities. 

We have provided a general formulation of the proposed 
technique that allows to operate on systems with two or more 
output channels. It performs well on reasonably small data sets 
and can handle systems with colored input signals. Results also 
show that it has fast convergence and it achieves identification 
performance that is very close to a related supervised method. 

Several directions for future research are open. First, it 
would be interesting to perform a theoretical analysis in 
the presence of noise on this problem. Also, some of the 
ideas proposed in this paper may be applied to other block- 



based systems such as multiple-input multiple-output (MIMO) 
Wiener systems and configurations with Hammerstein systems. 
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